Annotation data handlers for data stream processing

ABSTRACT

A method and apparatus for annotation data handlers for data stream processing. An embodiment of a method for processing annotations in computer files includes receiving a data stream input at a scanner component, where the data stream input represents program elements of one or more computer files. The data stream input is scanned for annotations, and a data stream is generated comprising annotated program elements and associated annotation values. The annotated program elements and annotation values are provided as an input to a handler component. The handler component performs one of filtering the annotated program elements and annotation values data, or echoing the annotated program elements and annotation values, and the handler component generates an output.

RELATED APPLICATIONS

This application is related to and claims priority to U.S. provisional patent application 60/953,938, filed Aug. 3, 2007.

This application is further related to:

-   U.S. patent application Ser. No. 11/648,065, entitled “Computer File System Traversal”, filed Dec. 30, 2006;
-   U.S. patent application Ser. No. ______, entitled “Computer Archive Traversal”, attorney docket 6570P472, filed Aug. 1, 2008, claiming priority to U.S. provisional application 60/953,932, filed Aug. 3, 2007;
-   U.S. patent application Ser. No. ______, entitled “Computer File Processing”, attorney docket 6570P473, filed Aug. 1, 2008, claiming priority to U.S. provisional application 60/953,933, filed Aug. 3, 2007;
-   U.S. patent application Ser. No. ______, entitled “Annotation Processing of Computer Files”, attorney docket 6570P474, filed Aug. 1, 2008, claiming priority to U.S. provisional application 60/953,935, filed Aug. 3, 2007;
-   U.S. patent application Ser. No. ______, entitled “Annotation Data Filtering of Computer Files”, attorney docket 6570P475, filed Aug. 1, 2008, claiming priority to U.S. provisional application 60/953,937, filed Aug. 3, 2007;
-   U.S. patent application Ser. No. ______, entitled “Dependency Processing of Computer Files”, attorney docket 6570P492, filed Aug. 1, 2008, claiming priority to U.S. provisional application 60/953,963, filed Aug. 3, 2007; and
-   U.S. patent application Ser. No. ______, entitled “Data Listeners for Type Dependency Processing”, attorney docket 6570P493, filed Aug. 1, 2008, claiming priority to U.S. provisional application 60/953,964, filed Aug. 3, 2007.

TECHNICAL FIELD

Embodiments of the invention generally relate to the field of computer systems and, more particularly, to a method and apparatus for annotation data handlers for data stream processing.

BACKGROUND

Computer files, such as Java class files, may have specific standard formats. The standard formats of computer files may limit the data that can be provided in relation to the files. For this reason, annotations may be provided to add additional information regarding computer files. Annotations may potentially be found anywhere within a set of computer files.

In a particular example, Java allows annotations to Java class files, with the practice now being specifically described in annotations under Java release 5.0 (Java under the JDK (Java Development Kit) 5.0) as provided in the JSR-175 recommendation regarding code annotations. The annotations may add guidance regarding certain class files. Thus, a Java class file may include one or more annotations associated with program elements.

It may become necessary or useful to filter program files to obtain annotations that are present in the files. However, the filtering of the program files may require a significant amount of processing time, and the process may not be easily adaptable to dynamic changes in filtering requirements.

SUMMARY OF THE INVENTION

A method and apparatus are provided for annotation data handlers for data stream processing.

In a first aspect of the invention, an embodiment of a method for processing annotations in computer files includes receiving a data stream input at a scanner component, where the data stream input represents program elements of one or more computer files. The data stream input is scanned for annotations, and a data stream is generated comprising annotated program elements and associated annotation values. The annotated program elements and annotation values are provided as an input to a handler component. The handler component performs filtering of the annotated program elements and annotation values data, or echoing the annotated program elements and annotation values, and the handler component generates an output.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.

FIG. 1 is an illustration of embodiments of data handlers for annotation processing;

FIG. 2 is an illustration of an embodiment of a combination of handler modules for annotation data stream processing;

FIG. 3 is an illustration of an embodiment of processing of computer file data;

FIG. 4 is an illustration of a computer file processing system;

FIG. 5 is an illustration of an embodiment of a system to process annotated program elements;

FIG. 6 is a flowchart to illustrate an embodiment of the scanning of a serial data stream for annotations to class file elements;

FIG. 7 is an illustration of an embodiment of a class file program element as a data stream;

FIG. 8 is an illustration of an embodiment of an annotation to a class file program element presented as a data stream;

FIG. 9 is an illustration of an embodiment of library utilities;

FIG. 10 is an illustration of a computer system in an embodiment of the invention; and

FIG. 11 illustrates an embodiment of a client-server network system.

DETAILED DESCRIPTION

Embodiments of the invention are generally directed to annotation data handlers for data stream processing.

As used herein:

“Annotation” means additional information or metadata that is associated with or attached to a particular point in a computer program. The term annotation includes formal and informal annotation methods, including, but not limited to, annotations under Java release 5.0 (Java under the JDK (Java Development Kit) 5.0) as provided in JSR (Java Specification Request) 175 recommendation regarding code annotations (“A Metadata Facility for the Java Programming Language”).

“Data type” means a classification of software artifacts stored in a computer file. Data type includes, but is not limited to, a type of a Java class file.

“Computer file” means any file structure used in a computer system. Computer files include files with specific required structures, including Java class files.

“Class file” means a Java class file. A Java class file is a defined format for compiled Java code, which may then be loaded and executed by any Java virtual machine. The format and structure for a Java class file is provided in JSR 000202, Java Class File Specification Update (Oct. 2, 2006) and subsequent specifications.

“Traversal” means a process for progressing through the elements of a computer system, including a process for progressing through the elements of a computer archive.

“Archive” means a single file that may contain one or more separate files. An archive may also be empty. The files within an archive are extracted, or separated, from the archive for use by a computer program. The files contained within an archive are commonly compressed, and the compressed files are decompressed prior to use. An archive may further include data required to extract the files from the archive. “Archive” may also refer to the act of transferring one or more files into an archive.

In an embodiment of the invention, processing of computer files to identify annotations is provided. In an embodiment, the computer files are in the form of a serial data stream, with the data stream being scanned for annotation occurrences. In one embodiment, a set of computer files includes a set of Java class files, where the Java class files include one or more annotations. In an embodiment of the invention, the processing of the data stream includes the use of one or more handler modules to process annotation data.

The Java platform has included various ad hoc mechanisms to provide annotations. Java release 5.0 (Java under the JDK (Java Development Kit) 5.0) includes a formal general purpose annotation (or metadata) facility, as provided in the JSR-175 recommendation regarding code annotations. The annotation facility permits a user to define and use the user's own annotation types. The facility includes a syntax for declaring annotation types, a syntax for annotating declarations, APIs for reading annotations, a class file representation for annotations, and an annotation processing tool. Annotations generally allow programmers to add attributes to computer code, including Java code. These attributes may be used for multiple purposes, including code documentation, code generation, and, during runtime, for providing special services such as enhanced business-level security or special business logic.

In an embodiment of the invention, annotation data is extracted into a neutral format to allow efficient filtering of annotations of interest. The annotations in a data stream are provided in a form to allow a scanner to address the elements in a data stream. In an embodiment, a type definition appears prior to any annotations on the elements of the particular type. In an embodiment, an annotation announcement is further made directly before the annotation value is reported, thus forewarning a scanner that an annotation will follow.

In an embodiment of the invention, a process is provided to receive a data stream input representing a computer program or other computer files, scan the data stream for instances of annotations, and generate a data stream output. The data stream output includes selected annotated elements and annotation values.

In an embodiment of the invention, a processing system includes a scanning module to scan the data stream input and output a data stream that includes selected annotated elements and annotation values. In an embodiment, the processing system further includes a handler module to handle the data stream output. In an embodiment, the handler module further provides feedback to the scanning module to direct the selection of annotations by the scanning module. In an embodiment, the file processor may include additional handlers to provide other functions. In one embodiment of the invention, a dedicated, independent processing module is provided for annotation processing, but embodiments of the invention are not limited to a dedicated module implementation.

In an embodiment of the invention, a processing system includes one or more handler modules for the processing of a data stream, with the handler modules including one or more of the following:

(1) A stream processing handler; or

(2) An echo utility handler.

In an embodiment of the invention, a stream processing handler is a module that includes one or more filter parameters that establish a filter condition. The filter condition is imposed on a filter element. In an embodiment, the stream processing handler receives a data stream of annotated program elements and annotation values, and outputs a filtered data stream of annotated program elements and annotation values. In an embodiment, the stream processing handler generates a serial data output in the same format as the serial data input. The stream processing handler further generates a callback control flow to control the operation of a data scanner that provides the data stream to the handler.

In an embodiment of the invention, an echo utility handler is a module that echoes, or prints, the data elements of a data stream input so that the annotation data may be read or processed. In one embodiment, an annotation echo module produces a text stream output from a data stream input including annotated program elements and annotation values, thereby providing a user with a readable output of annotation information. In a second embodiment, the annotation echo module produces annotation data in a form that may be provided to external processing, including, but not limited to, comma-separated type definition and usage values or another common format.

In an embodiment of the invention, the stream processing handler receives data stream input of type definition and type usage data, and produces a data stream output of type definition and type usage data. In an embodiment, the input format and the output format are the same for the stream processing handler. In this manner, multiple annotation data handler modules may be placed in series. In one example, a data stream input may be received by a stream processing handler, which filters out unwanted annotation data and produces a filtered data stream output of annotated program elements and annotation values. The filtered data stream output then may be utilized as a data stream input for an echo utility, which converts the data stream into a text stream output for user analysis or into a form for external processing.
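By way of illustration only, the placement of handlers in series may be sketched in Java as follows. The sketch assumes a single hypothetical handler interface shared by every stage; the names HandlerChainSketch, AnnotationHandler, FilterHandler, and EchoHandler, and the sample annotation data, are assumptions made for this example and do not appear in the embodiments described herein.

```java
import java.util.function.Predicate;

final class HandlerChainSketch {

    /** Hypothetical contract shared by every stage, so that stages can be placed in series. */
    interface AnnotationHandler {
        void annotation(String element, String annotationType, String value);
    }

    /** Stream processing handler: forwards only annotations whose type meets the filter condition. */
    static final class FilterHandler implements AnnotationHandler {
        private final Predicate<String> condition;
        private final AnnotationHandler next;

        FilterHandler(Predicate<String> condition, AnnotationHandler next) {
            this.condition = condition;
            this.next = next;
        }

        @Override
        public void annotation(String element, String annotationType, String value) {
            if (condition.test(annotationType)) {
                next.annotation(element, annotationType, value); // same format in and out
            }
        }
    }

    /** Echo utility handler: prints each annotation as one line of text. */
    static final class EchoHandler implements AnnotationHandler {
        private final StringBuilder out;

        EchoHandler(StringBuilder out) {
            this.out = out;
        }

        @Override
        public void annotation(String element, String annotationType, String value) {
            out.append(element).append(" @").append(annotationType)
               .append(" = ").append(value).append('\n');
        }
    }

    /** Feeds two sample annotations through a filter stage followed by an echo stage. */
    static String demo() {
        StringBuilder text = new StringBuilder();
        AnnotationHandler chain =
            new FilterHandler(type -> type.startsWith("javax.ejb."), new EchoHandler(text));
        chain.annotation("OrderBean", "javax.ejb.Stateless", "name=Orders");
        chain.annotation("OrderBean.legacyTotal", "java.lang.Deprecated", "");
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Because the filter stage accepts and emits the same calls, a further filter stage could be inserted ahead of the echo stage without changing either one.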

In an embodiment of the invention, a scanner operates by sending data to the handler as it is identified in the data stream. In an embodiment, a type definition is thus received before receiving any of the program elements within the type, and thus also prior to any annotation value for the program elements within the type. In this manner, the handler may receive a type definition and may provide a callback to the scanner if the handler is not interested in annotations for the type definition. If the handler indicates that it is not interested in any annotations for the type definition, the scanner may then skip any annotation values for the type definition.

In an embodiment of the invention, a set of computer files are processed in a single pass as a serial data stream without requiring multiple readings of the file data. In an embodiment, the same serial data stream format is maintained both on input and output, thereby allowing further processing of class files without further file conversion. In an embodiment, the same data format is used for the data input and the data output. In an embodiment, the data stream conversion allows processing without any dependency on random access files, and broadens the applicable scope of the process for the input. In an embodiment, the processing of class files as a data stream allows processing without requiring use of, for example, Java library utilities that may normally be required to conduct the file processing.

In an embodiment of the invention, the conversion of computer files to a data stream allows for the use of a protocol for both the data producer (the computer file processor) and the data consumer without creating a complete file representation, thereby simplifying the data structure. In an implementation for Java class files, the processing system operates with a class file data model, without requiring the addition of any major abstraction for data processing.

In an embodiment, the conversion of computer files to a serial data format may include, but is not limited to, the operation of a traversal of a hierarchical data structure or of a data archive as provided respectively in patent application Ser. No. 11/648,065, entitled “Computer File System Traversal”, filed Dec. 30, 2006. Other processes for conversion of a set of files to a serial data stream may also be utilized in embodiments of the invention.

In an embodiment of the invention, for the processing of computer files it is assumed that processing occurs on an inner loop for critical processing stages. In an embodiment, a system requires high performance for the inner loop of the class file processing itself.

In an embodiment of the invention, processing is designed to provide sufficient performance for overall computer file processing. For example, in an embodiment a system includes stream buffering to buffer data as it is obtained and processed. In addition, an embodiment of the invention provides a compact internal file state in the data stream, thereby minimizing the amount of data that will be required in the process of transferring and processing the computer files.

In an embodiment of the invention, a data scanner may be provided in multiple implementations, depending on the system requirements. A data scanner may be a portion of a file processor. In one example, native processing implementations may be provided for a computer file scanner, with the native implementations being based upon relevant Java standards. In another example, a non-native implementation may be provided, as required. A particular non-native implementation may include a BCEL (Byte Code Engineering Library) implementation, with the BCEL API being a toolkit for the static analysis and dynamic creation or transformation of Java class files.

In an embodiment of the invention, a data consumer that receives generated annotation output data is a main framework extension point for which neutral utility implementations might be required. In an embodiment of the invention, a file processor (the data producer) operates using the same data protocol as the data consumer protocol. In an embodiment of the invention, the data consumer may have control over the data to be provided to the data consumer. In an embodiment, the data producer and the data consumer may cooperate to agree on the data to be provided from the serial data stream. In an embodiment of the invention, a system may include complexity control, including configuring the file processor to deliver the data of interest. In an embodiment, the data of interest includes data meeting a certain degree of detail, or certain types of data. In an embodiment of the invention, the structure of the data processing may allow for a system to be utilized with loose semantics and implementation constraints. For example, the technical framework and protocol data types may be defined. However, there may be leeway for implementation characteristics, such as the result order sequence and analysis capabilities.

In an embodiment of the invention, file processing may be included within a set of tools that are provided to search files. The tools may, for example, provide for the conversion of files into serial form by a traversal process, the scanning of data for desired elements, and other related processes.

FIG. 1 is an illustration of embodiments of data handlers for annotation processing. In an embodiment, a data handler receives a serial data stream input of annotated elements and annotation values 105. The serial data stream may be received from a data scanner, which has scanned a data stream to identify annotated elements and annotation values.

In a first embodiment of the invention, the data handler may be an echo utility handler module 110 to produce data for examination by a user or for external processing. In this illustration, an echo utility includes an annotation echo element 115 to receive the contents of the data stream input 105 and to print such data to put the data in a form for examination or external processing. For example, the annotation echo element 115 may print the input data stream 105 into a text stream 120. The data of the text stream can then be provided to a user 130, who may analyze the data to reach conclusions regarding the original computer files. In a second example, the annotation echo element 115 may print the data stream input 105 into, for example, comma-separated values 125 that may be used in external processing 135.
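As a minimal sketch of the second example, and not a description of any particular implementation, an echo element that prints comma-separated values might format each annotation as one record. The class name CsvEchoSketch, the three-column layout, and the quoting rules (common CSV practice of wrapping a field in double quotes and doubling embedded quotes) are assumptions made for this example.

```java
final class CsvEchoSketch {

    /** Quotes a field only when it contains a comma, a double quote, or a line break. */
    static String quote(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
            || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Formats one annotated element as a CSV record: element, annotation type, value. */
    static String echo(String element, String annotationType, String value) {
        return quote(element) + "," + quote(annotationType) + "," + quote(value);
    }

    public static void main(String[] args) {
        System.out.println(echo("Element1", "ANNO1", "AnnoValue1"));
        System.out.println(echo("Element3", "ANNO2", "a,b"));
    }
}
```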

In a second embodiment of the invention, the data handler may be a stream processing handler module 140. In this illustration, the stream processing handler module 140 includes an annotation filter 145 to filter out unwanted annotation data. The annotation filter 145 may operate based on a filter condition 160. The filter condition 160 may utilize certain parameters 165 that may be set to establish filter operations. For example, the parameters may be derived from a configuration for the annotation handler module. The annotation filter may further include a temporary storage 150 for use in holding annotation data temporarily while the data is filtered. The annotation filter 145 produces a data stream output 175, which in this embodiment is a filtered data stream of annotated elements and annotation values in which unwanted annotation data has been filtered out. In addition, the stream processing handler module 140 may produce a callback control flow to control the annotation elements chosen for the data stream input 105.
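One possible reading of the filter condition, parameters, and temporary storage is sketched below in Java. The sketch assumes that the parameters arrive as a java.util.Properties object under a hypothetical key, and that the temporary storage holds the annotations of the current type until the type ends, at which point all of them are released if at least one met the filter condition. The class name, the key name, the release policy, and the sample values ANNO3 and AnnoValue3 are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

final class AnnotationFilterSketch {

    /** Hypothetical configuration key; no parameter names are given in the description. */
    static final String WANTED_KEY = "filter.annotationTypes";

    private final Set<String> wanted;                        // filter condition built from parameters
    private final List<String> pending = new ArrayList<>();  // temporary storage for the current type
    private final List<String> output = new ArrayList<>();   // filtered data stream
    private boolean matched;

    AnnotationFilterSketch(Properties parameters) {
        String raw = parameters.getProperty(WANTED_KEY, "").trim();
        wanted = new HashSet<>(Arrays.asList(raw.split("\\s*,\\s*")));
    }

    /** Holds the annotation temporarily and records whether it meets the filter condition. */
    void annotation(String element, String annotationType, String value) {
        pending.add(element + " @" + annotationType + " = " + value);
        matched |= wanted.contains(annotationType);
    }

    /** Releases the held annotations of a type only if one of them met the condition. */
    void endType() {
        if (matched) {
            output.addAll(pending);
        }
        pending.clear();
        matched = false;
    }

    List<String> output() {
        return output;
    }

    /** Runs two sample types through a filter configured to look for ANNO2. */
    static List<String> demo() {
        Properties parameters = new Properties();
        parameters.setProperty(WANTED_KEY, "ANNO2");
        AnnotationFilterSketch filter = new AnnotationFilterSketch(parameters);
        filter.annotation("Type1.Element1", "ANNO1", "AnnoValue1");
        filter.endType();
        filter.annotation("Type2.Element2", "ANNO3", "AnnoValue3");
        filter.annotation("Type2.Element3", "ANNO2", "AnnoValue2");
        filter.endType();
        return filter.output();
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

In this sketch the annotations of Type1 are discarded, while both annotations of Type2 are released because one of them is of the configured type. Other release policies are equally possible.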

FIG. 2 is an illustration of an embodiment of a combination of handler modules for annotation data stream processing. In this illustration, multiple data handlers may be utilized in series to process a received data stream. For example, a data stream of annotated data elements and annotation values 205 may be received, such as from a scanning module that is scanning a data stream for annotation data. The data is received by a first handler module, data handler 1 210, which may, for example, be a stream processing handler, such as the stream processing handler module 140 illustrated in FIG. 1. The data handler 1 210 produces a data stream output of annotated data elements and annotation values 215, which may represent a filtered data stream in which unwanted data type definitions and type usages have been filtered out. The data stream output 215 becomes a data stream input for a second handler module, data handler 2 220. Data handler 2 220 may be, for example, an echo utility handler, such as the echo utility handler module 110 shown in FIG. 1. The echo utility handler converts the data stream input 215 by printing such data, such as in the form of a text stream for a user or a stream of formatted (comma-separated) data for external processing.

FIG. 3 is an illustration of an embodiment of processing of computer file data. In this illustration, a computer file conversion module 305 is provided to convert computer file data into a serial data stream 310. The computer file data may be, but is not limited to, Java class file program elements. The conversion of the computer file data may include, but is not limited to, the traversal of a hierarchical file or archive. The output of the processing of computer file data is a serial data stream 310 representing the computer file data.

In an embodiment, the serial data stream includes one or more annotations. For example, the data stream 310 is illustrated as a series of program elements arriving as a data stream 330. In this data stream, there is a type definition prior to any elements within the type, and an annotation announcement occurs prior to any annotations. For example, Type1 335 is a first type definition, which is followed by program element Element1 340 within Type1. Element1 340 is associated with an annotation, with ANNO1 345 being a first annotation descriptor for annotation value AnnoValue1 350. The data stream further includes a second type definition Type2 355, which includes program elements Element2 360 and Element3 365. Element3 365 is associated with a second annotation, as shown by second annotation descriptor ANNO2 370 and annotation value AnnoValue2 375.
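The ordering of the example data stream may be sketched in Java as a list of events, together with a check of the ordering rules stated above. This is an illustrative model only: the event kinds, the class names, and the exact rules enforced by the check (a type definition precedes any element or annotation announcement, and each annotation value is directly preceded by its announcement) are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

final class SerialStreamSketch {

    /** Hypothetical event kinds for the items of the serial data stream. */
    enum Kind { TYPE, ELEMENT, ANNOTATION_ANNOUNCEMENT, ANNOTATION_VALUE }

    static final class StreamEvent {
        final Kind kind;
        final String payload;

        StreamEvent(Kind kind, String payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    /** The example stream of FIG. 3, in arrival order. */
    static List<StreamEvent> figure3Stream() {
        return Arrays.asList(
            new StreamEvent(Kind.TYPE, "Type1"),
            new StreamEvent(Kind.ELEMENT, "Element1"),
            new StreamEvent(Kind.ANNOTATION_ANNOUNCEMENT, "ANNO1"),
            new StreamEvent(Kind.ANNOTATION_VALUE, "AnnoValue1"),
            new StreamEvent(Kind.TYPE, "Type2"),
            new StreamEvent(Kind.ELEMENT, "Element2"),
            new StreamEvent(Kind.ELEMENT, "Element3"),
            new StreamEvent(Kind.ANNOTATION_ANNOUNCEMENT, "ANNO2"),
            new StreamEvent(Kind.ANNOTATION_VALUE, "AnnoValue2"));
    }

    /** Returns true if the stream obeys the ordering rules assumed for this sketch. */
    static boolean isWellOrdered(List<StreamEvent> stream) {
        boolean typeSeen = false;
        Kind previous = null;
        for (StreamEvent event : stream) {
            // An announcement must be followed immediately by its annotation value.
            if (previous == Kind.ANNOTATION_ANNOUNCEMENT && event.kind != Kind.ANNOTATION_VALUE) {
                return false;
            }
            switch (event.kind) {
                case TYPE:
                    typeSeen = true;
                    break;
                case ELEMENT:
                case ANNOTATION_ANNOUNCEMENT:
                    if (!typeSeen) {
                        return false; // nothing may precede the first type definition
                    }
                    break;
                case ANNOTATION_VALUE:
                    if (previous != Kind.ANNOTATION_ANNOUNCEMENT) {
                        return false; // a value without an announcement directly before it
                    }
                    break;
            }
            previous = event.kind;
        }
        return previous != Kind.ANNOTATION_ANNOUNCEMENT; // no dangling announcement at the end
    }

    public static void main(String[] args) {
        System.out.println(isWellOrdered(figure3Stream()));
    }
}
```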

In an embodiment of the invention, the serial data stream 310 then is provided to a data scanner 315, which processes the data, including scanning the data stream for program elements of interest, including annotations to the program elements within the data stream. The scanner 315 may contain multiple modules or sub-modules, depending on the particular embodiment. The scanner 315 outputs an extracted data stream 320, which represents elements of the data stream that have been selected by the scanner 315. In this implementation, the extracted data stream would contain the annotated program elements and associated annotation values. The extracted data stream 320 then is eventually provided to a data consumer 325. The consumer 325 may receive additional reports or data processing as required for the needs of the consumer 325.

FIG. 4 is an illustration of a computer file processing system 400. While this illustration shows the processes occurring within a single system for simplicity in description, the processes may occur in multiple systems, including multiple systems within a network. In this illustration, a computer file data stream input 405 is provided to a file processor 410, which may include a scanner to scan the data for desired program elements. The data stream 405 may, for example, represent Java class file data that has been converted into a serial data stream. The file processor 410 may include multiple components, depending on the particular embodiment of the invention. The file processor 410 generates an extracted computer file data stream 415, which may be presented to a data consumer 420.

In an embodiment of the invention, the operation of the computer file processing system 400 is directed by certain inputs and settings. The operation of the file processor 410 may be directed by a scanner configuration 425. In addition, a data mode configuration 430 affects both the file processor 410 and the data consumer 420. The file processor 410 also may include one of multiple implementations. In particular embodiments, the implementation may be a native implementation 435 or a BCEL (Byte Code Engineering Library) implementation 440. The BCEL implementation 440 may include the Apache BCEL process 445, as developed by the Apache Software Foundation. In addition, the consumer 420 may utilize a framework utility 450 and a framework extension 455 in the operation of the computer file processing.

FIG. 5 is an illustration of an embodiment of a system to process annotated program elements. The system 500 may include a data scanner 510 and a data handler 520. The data scanner 510 may, for example, represent the file processor 410 illustrated in FIG. 4 or a subpart of the file processor 410. The data handler 520 may represent the data consumer 420 illustrated in FIG. 4 or a subpart of the data consumer 420. In this illustration, the data scanner 510 is to scan a received data stream input 505 for annotations, and to produce a data stream containing selected annotated program elements and annotation values 515. The data handler 520 is to receive and handle the output of the scanner 510. The operation of the data handler includes the provision of feedback to the data scanner. As illustrated, in addition to any other functions, the data handler 520 provides a callback control flow 525 to inform the scanner whether particular program elements are desired. For example, the data scanner 510 may encounter a particular data type (such as a type description for a Java class file), and the data handler 520 may inform the data scanner 510 via the callback control flow 525 that annotations for the particular data type are not of interest. Upon being informed via the callback control flow 525 that annotations for the particular data type are not of interest, the data scanner 510 may then skip the elements in the data type.

The data scanner 510 may include a native implementation 540 and a BCEL implementation 545, illustrated with Apache BCEL 550. The implementations may be associated with a parsing module 555 to parse type descriptors and identify the appropriate data types. Also illustrated are the scanner configuration 530 and the data mode configuration 535.

FIG. 6 is a flowchart to illustrate an embodiment of the scanning of a serial data stream for annotations to class file elements. In this illustration, a set of class files is converted into a serial class file data stream 600. The class file data stream may include, but is not limited to, a data stream generated through the traversal of a hierarchical file system or an archive.

The serial class file data stream 600 is received by a scanning module 605, which operates in conjunction with a handling module 610 to identify and output annotations of interest in the data stream. In this process, a particular type description is received in the data stream 615, and the handling module 610 is informed regarding the class type that was encountered. There is then a determination whether the elements of the class type should be processed 620. This determination may be based at least in part on any feedback received from the handling module 610 indicating that the class type is not of interest 655. If the program elements in the class type should not be processed, then the elements in the class type are skipped and the process continues to a determination whether there are more program elements remaining 635. If the class is of interest, then the scanning module 605 scans the program elements for annotations 625.

If any annotations are found 630, then a data stream is provided to the handling module 610 including the annotated element and the annotation value 660. When no more program elements remain in the received data stream, then the process ends 640. The processing of program elements may include other processes not illustrated here, depending on the embodiment of the invention.
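The flow described above may be sketched in Java as a single pass over the stream, with the handler's return value standing in for the feedback that a class type is not of interest. The token encoding ("T:" for a type description, "E:" for a program element, "A:type=value" for an annotation), the interface, and all names are assumptions made for this example, and the sketch performs no validation of malformed tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ScanLoopSketch {

    /** Hypothetical handler contract; the boolean return models the callback control flow. */
    interface AnnotationHandler {
        boolean startType(String typeName); // false means "this type is not of interest"
        void annotation(String typeName, String element, String annotationType, String value);
    }

    /** Scans the token stream once and returns the number of annotations reported. */
    static int scan(List<String> stream, AnnotationHandler handler) {
        String currentType = null;
        String currentElement = null;
        boolean skipping = false;
        int reported = 0;
        for (String token : stream) {
            String body = token.substring(2);
            if (token.startsWith("T:")) {
                currentType = body;
                currentElement = null;
                skipping = !handler.startType(body); // inform the handler, obtain feedback
            } else if (skipping) {
                continue; // skip elements and annotation values of an uninteresting type
            } else if (token.startsWith("E:")) {
                currentElement = body;
            } else if (token.startsWith("A:")) {
                int eq = body.indexOf('=');
                handler.annotation(currentType, currentElement,
                    body.substring(0, eq), body.substring(eq + 1));
                reported++;
            }
        }
        return reported;
    }

    /** Scans the FIG. 3 example stream with a handler that declines Type1. */
    static List<String> demo() {
        List<String> stream = Arrays.asList(
            "T:Type1", "E:Element1", "A:ANNO1=AnnoValue1",
            "T:Type2", "E:Element2", "E:Element3", "A:ANNO2=AnnoValue2");
        final List<String> seen = new ArrayList<>();
        scan(stream, new AnnotationHandler() {
            @Override
            public boolean startType(String typeName) {
                return !typeName.equals("Type1");
            }

            @Override
            public void annotation(String typeName, String element,
                                   String annotationType, String value) {
                seen.add(typeName + "." + element + " @" + annotationType + " = " + value);
            }
        });
        return seen;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Because the type description arrives before the elements of the type, the handler's decision is known before any annotation value of that type is reached, which is what allows the scanner to skip them.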

FIG. 7 is an illustration of an embodiment of a class file program element as a data stream. In this illustration, a class file program element 706 is shown within a code walk module 702 (used in the traversal of class files). The class file program element 706 is represented by an element type 708 (including an element kind, type name, and type flags), an element field 710 (also including an element name and element flags), one or more element methods 712 (also including a method signature), and an element parameter 714 (paramIdx). The element further includes a class file element record 716 in the code walk implementation 704, including one or more operations (defining field accessors, shallow or deep equality, ordering relation, binary serialization, and XML serialization).

FIG. 8 is an illustration of an embodiment of an annotation to a class file program element presented as a data stream. In this illustration, an annotation is represented in the code walk 802 as a class file annotation value 806, including whether the value is visible at runtime, a type name, and the annotation elements. The class file annotation value 806 is related to a particular named program element 808, which includes the element name. The annotated program element 810 includes the element tag and element value, as well as tag-specific accessors. The program element 810 is shown in relation to the annotation 814, as well as either Boolean 816, char (character) 818, double, float 820, byte, short, int (integer), or long 822. The program element further may include a string 824 or class 826, an enum (enumeration constant) 828, and an array 830. The enumeration constant 828 is illustrated 812 as including an enumeration type and enumeration literal.

The annotation is further illustrated as a class file annotation record 832 in a code walk implementation 804. The class file annotation record 832 includes operations, including shallow or deep equality, the ordering relation, binary serialization, and XML serialization 832. The class file annotation record 832 is shown in relation with the named element 834. Also illustrated are the annotated element 836 and the enumeration constant 838.

FIG. 9 is an illustration of an embodiment of library utilities. FIG. 9 may illustrate software modules, hardware modules, or modules including a combination of software and hardware. In this illustration, the utilities relate to an interface layer comprising code walk interfaces (code.walk 980) for class file processing and file walk interfaces (file.walk 910) for locating files, and further to an implementation toolbox comprising code processing 950 and a code walk implementation (code.walk.impl 960) for class file processing, and file processing 955 and a file walk implementation (file.walk.impl 930) for locating files.

In the interface layer, the code walk interfaces 980 may include a class file annotation value interface module 982, a class file program element interface module 984, a class file annotation handler interface module 986, a class file annotation scanner interface module 988, a class file dependency scanner interface module 990, and a class file dependency listener interface module 992. The file walk interfaces then may include a file condition interface module 912, a file name classifier interface module 914, a directory walker handler interface module 916, a directory walker interface module 918, a zip walker handler interface module (“zip” indicating use for archives) 920, a zip walker interface module 922, and a file notification interface module 924.

In an embodiment of the invention, the code processing 950 may provide for parsing types from class file descriptors. Code processing 950 may include a class file format helper module 952 and a class file descriptor parser module. The code walk implementation 960 for class file processing may include a class file annotation record module 962, a class file element record module 964, a class file annotation filter 966, a class file annotation for native elements 968, a class file dependencies module for native elements 970, a class file dependencies module for BCEL (Byte Code Engineering Library) elements 972, a class file dependency concentrator module 974, and a class file dependency filter 976.
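As a non-limiting illustration of how an annotation handler interface, an annotation filter, and an echo handler might cooperate, consider the following Java sketch. The interface, its two methods, and the representation of an event as an element name, annotation type, and value are assumptions made for this example. The scanner is a stand-in that iterates over prepared events rather than parsing class files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical scanner/handler chain; all names are illustrative. */
final class HandlerChainSketch {

    /** Assumed handler interface: receives annotated elements and gives feedback. */
    interface AnnotationHandler {
        /** Feedback to the scanner: is this annotation type wanted at all? */
        boolean wants(String annotationType);

        /** Receives one annotated program element with its annotation value. */
        void handle(String element, String annotationType, String value);
    }

    /** Echo handler: renders every event it receives as one line of text. */
    static final class EchoHandler implements AnnotationHandler {
        final List<String> lines = new ArrayList<>();

        @Override
        public boolean wants(String annotationType) {
            return true;
        }

        @Override
        public void handle(String element, String annotationType, String value) {
            lines.add(element + " @" + annotationType + "(" + value + ")");
        }
    }

    /** Filter handler: forwards selected annotation types in the same format. */
    static final class FilterHandler implements AnnotationHandler {
        private final Set<String> wanted;
        private final AnnotationHandler next;

        FilterHandler(Set<String> wanted, AnnotationHandler next) {
            this.wanted = wanted;
            this.next = next;
        }

        @Override
        public boolean wants(String annotationType) {
            return wanted.contains(annotationType) && next.wants(annotationType);
        }

        @Override
        public void handle(String element, String annotationType, String value) {
            if (wants(annotationType)) {
                next.handle(element, annotationType, value);
            }
        }
    }

    /** Stand-in scanner: each event is an {element, annotationType, value} triple. */
    static int scan(List<String[]> events, AnnotationHandler handler) {
        int delivered = 0;
        for (String[] e : events) {
            if (!handler.wants(e[1])) {
                continue; // feedback lets the scanner skip unwanted annotations
            }
            handler.handle(e[0], e[1], e[2]);
            delivered++;
        }
        return delivered;
    }
}
```

Because the filter implements the same interface that it forwards to, handlers can be chained: a filter constructed around an echo handler passes only the selected annotations on to be rendered as text.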

In an embodiment of the invention, the file processing 955 may include a comma separated value (CSV) formatter and a CSV scanner. The file walk implementation 930 for locating files may include a simple file condition module 932, a basic file name classifier module 934, a directory finder module 936, a directory walker implementation module 938, a walk recorder module 940, a zip (archive) condenser module 942, and a zip walker implementation module 944.
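By way of example only, a CSV formatter and a matching CSV scanner might be sketched in Java as follows. The specification does not state which records are formatted or which quoting rules apply; the class name, the method names, and the quoting convention (fields containing a comma, quote, or line break are enclosed in double quotes, with embedded quotes doubled) are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical CSV formatter and scanner for a single record. */
final class CsvSketch {

    /** Formats one record, quoting fields that contain a comma, quote, or line break. */
    static String format(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            String f = fields.get(i);
            boolean quote = f.contains(",") || f.contains("\"")
                    || f.contains("\n") || f.contains("\r");
            sb.append(quote ? "\"" + f.replace("\"", "\"\"") + "\"" : f);
        }
        return sb.toString();
    }

    /** Scans one record produced by format back into its fields. */
    static List<String> scan(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // doubled quote stands for one literal quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false; // closing quote
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }
}
```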

FIG. 10 is an illustration of a computer system in an embodiment of the invention. The computer system may be utilized as a system for processing of computer files in the form of a data stream, or may represent one of multiple systems used in such processing. The computing system illustrated in FIG. 10 is only one of various possible computing system architectures, and is a simplified illustration that does not include many well-known elements. As illustrated, a computing system 1000 can execute program code stored by an article of manufacture. Computer system 1000 may be a J2EE system, ABAP system, or administration system. A computer system 1000 includes one or more processors 1005 and memory 1010 coupled to a bus system 1020. The bus system 1020 is an abstraction that represents any one or more separate physical buses, point-to-point connections, or both connected by appropriate bridges, adapters, or controllers. The bus system 1020 may include, for example, a system bus, a Peripheral Component Interconnect (PCI) bus, a HyperTransport or industry standard architecture (ISA) bus, a small computer system interface (SCSI) bus, a universal serial bus (USB), or an Institute of Electrical and Electronics Engineers (IEEE) standard 1394 bus, sometimes referred to as “Firewire” (“Standard for a High Performance Serial Bus” 1394-1995, IEEE, published Aug. 30, 1996, and supplements thereto).

As illustrated in FIG. 10, the processors 1005 are central processing units (CPUs) of the computer system 1000 and control the overall operation of the computer system 1000. The processors 1005 execute software stored in memory 1010. A processor 1005 may be, or may include, one or more programmable general-purpose or special-purpose microprocessors, digital signal processors (DSPs), programmable controllers, application specific integrated circuits (ASICs), programmable logic devices (PLDs), or the like, or a combination of such devices.

Memory 1010 is or includes the main memory of the computer system 1000. Memory 1010 represents any form of random access memory (RAM), read-only memory (ROM), flash memory, or the like, or a combination of such devices. Memory 1010 stores, among other things, the operating system 1015 of the computer system 1000.

Also connected to the processors 1005 through the bus system 1020 are one or more mass storage devices 1025 and a network adapter 1035. Mass storage devices 1025 may be or may include any conventional medium for storing large volumes of instructions and data 1030 in a non-volatile manner, such as one or more magnetic or optical based disks. In an embodiment of the invention, the mass storage devices may include storage of files or an archive 1032 that requires processing. In an embodiment of the invention, the processors 1005 may operate to traverse the files or archive 1032, the traversal of the files or archive 1032 resulting in output of a serial data stream representing selected elements of the files or archive. The processors 1005 may scan the serial stream for desired program elements within the computer files. In another embodiment, the computer system 1000 may provide for the conversion of the computer files into a serial data stream, while another system or systems is responsible for scanning the data stream for desired program elements.
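As a non-limiting illustration, traversing an archive sequentially and selecting entries of interest might be sketched in Java with the standard java.util.zip classes. The class and method names are hypothetical, and selecting entries by the ".class" suffix is an assumption made for this example. The sketch reports entry names only and does not scan entry contents for program elements. The sampleArchive helper exists solely to make the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sequential archive walk; names are illustrative. */
final class ArchiveWalkSketch {

    /** Reads the archive as a serial stream and returns the names of class file entries. */
    static List<String> classEntries(byte[] archive) {
        List<String> selected = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumed selection rule: keep non-directory entries ending in ".class".
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    selected.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return selected;
    }

    /** Builds a tiny in-memory archive with one manifest and one class file entry. */
    static byte[] sampleArchive() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("META-INF/MANIFEST.MF"));
            zip.write("Manifest-Version: 1.0\n".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("com/example/Cart.class"));
            // Placeholder content: only the class file magic number.
            zip.write(new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE});
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Calling ArchiveWalkSketch.classEntries(ArchiveWalkSketch.sampleArchive()) yields a list containing the single name com/example/Cart.class.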

The network adapter 1035 provides the computer system 1000 with the ability to communicate with remote devices over a network 1040, and may be, for example, an Ethernet adapter. In one embodiment, the network adapter may be utilized to output data including, for example, an extracted serial data stream representing selected elements of the files or archive 1032.

FIG. 11 illustrates an embodiment of a client-server network system. As illustrated, a network 1125 links a server 1130 with client systems 1105, 1110, and 1115. Client 1115 may include certain data storage 1120, including computer files in the form of, for example, a computer file hierarchy or computer archive 1122. Server 1130 includes a programmable data processing system suitable for implementing apparatus, programs, and/or methods in accordance with one or more embodiments of the present invention. Server 1130 includes processor 1135 and memory 1140. Server 1130 provides a core operating environment for one or more runtime systems, including, for example, virtual machine 1145, at memory 1140 to process user requests. Memory 1140 may include a shared memory area that is accessible by multiple operating system processes executing in server 1130. For example, virtual machine 1145 may include an enterprise server (e.g., a J2EE-compatible server or node, Web Application Server developed by SAP AG, WebSphere Application Server developed by IBM Corp. of Armonk, N.Y., and the like). Memory 1140 can be used to store an operating system, a Transmission Control Protocol/Internet Protocol (TCP/IP) stack for communicating over network 1125, and machine executable instructions executed by processor 1135. The memory 1140 may also include data 1150 for processing, including the processing of data that includes data of one or more computer file hierarchies or computer archives 1152. In an embodiment, the data has been converted into a serial data stream for processing. In some embodiments, server 1130 may include multiple processors, each of which can be used to execute machine executable instructions.

Client systems 1105-1115 may execute multiple applications or application interfaces. Each instance of an application or application interface may constitute a user session. Each user session may generate one or more requests to be processed by server 1130. The requests may include instructions or code to be executed on a runtime system, such as virtual machine 1145 on server 1130.

In the description above, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form.

The present invention may include various processes. The processes of the present invention may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the processes. Alternatively, the processes may be performed by a combination of hardware and software.

Portions of the present invention may be provided as a computer program product, which may include a computer-readable medium having stored thereon computer program instructions, which may be used to program a computer (or other electronic devices) to perform a process according to the present invention. The computer-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs (compact disk read-only memory), magneto-optical disks, ROMs (read-only memory), RAMs (random access memory), EPROMs (erasable programmable read-only memory), EEPROMs (electrically-erasable programmable read-only memory), magnetic or optical cards, flash memory, or other type of media/machine-readable medium suitable for storing electronic instructions. Moreover, the present invention may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer.

Many of the methods are described in their most basic form, but processes can be added to or deleted from any of the methods, and information can be added to or subtracted from any of the described messages, without departing from the basic scope of the present invention. It will be apparent to those skilled in the art that many further modifications and adaptations can be made. The particular embodiments are not provided to limit the invention but to illustrate it. The scope of the present invention is not to be determined by the specific examples provided above but only by the claims below.

It should also be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature may be included in the practice of the invention. Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims are hereby expressly incorporated into this description, with each claim standing on its own as a separate embodiment of this invention. 

1. A method for processing annotations in computer files comprising: receiving a data stream input at a scanner component, the data stream input representing a plurality of program elements of one or more computer files; scanning the data stream input for annotations; generating a data stream comprising a plurality of annotated program elements and associated annotation values; providing the annotated program elements and annotation values as input to a handler component, wherein the handler component performs one of the handling functions comprising: filtering the annotated program elements and annotation values data, or echoing the annotated program elements and annotation values; and generating an output from the handler component.
 2. The method of claim 1, wherein the output from the handler component is a data stream having the same format as the input to the handler component.
 3. The method of claim 2, further comprising providing the output from the handler component as input to a second handler component.
 4. The method of claim 1, wherein filtering the annotated program elements and annotation values further includes providing feedback regarding which annotations are desired.
 5. The method of claim 4, wherein scanning the data stream input for annotations is based at least in part on the feedback regarding which annotations are desired.
 6. The method of claim 1, wherein the output of the handler component is an echo of the input data to the handler component.
 7. The method of claim 1, wherein the one or more computer files comprise Java class files.
 8. An annotation processing system comprising: a data scanning module, the data scanning module to receive a data stream input containing a plurality of program elements and to scan the data stream input to identify annotated data elements and annotation values; and a data handler module, the data handler module to receive the identified annotated data elements and annotation values, the data handler module being one of the following: a stream processing handler to filter annotations and to generate feedback to the data scanning module regarding annotations that are needed; or an echo mechanism to echo the annotated data elements and annotation values.
 9. The system of claim 8, further comprising a second data handler module, wherein an output of the data handler module is an input to the second data handler module.
 10. The system of claim 9, wherein a format of the output of the data handler module is the same as the format of the input to the data handler module.
 11. The system of claim 8, wherein the stream processing handler comprises one or more configurable parameters for a filter condition.
 12. The system of claim 8, wherein the echo mechanism produces a text stream for a user.
 13. The system of claim 8, wherein the echo mechanism produces text for external processing.
 14. An article of manufacture comprising: a computer-readable medium including instructions that, when accessed by a processor, cause the computer to perform operations comprising: receiving a data stream input at a scanner component, the data stream input representing a plurality of program elements of one or more computer files; scanning the data stream input for annotations; generating a data stream comprising a plurality of annotated program elements and associated annotation values; providing the annotated program elements and annotation values as input to a handler component, wherein the handler component performs one of the handling functions comprising: filtering the annotated program elements and annotation values data, or echoing the annotated program elements and annotation values; and generating an output from the handler component.
 15. The article of manufacture of claim 14, wherein the output from the handler component is a data stream having the same format as the input to the handler component.
 16. The article of manufacture of claim 15, where the listening function of the listening component is either filtering the type definition and type usage data or aggregating the type usage data.
 17. The article of manufacture of claim 15, wherein the medium further includes instructions that, when accessed by a processor, cause the computer to perform operations comprising: providing the output from the handler component as input to a second handler component.
 18. The article of manufacture of claim 14, wherein the output of the handler component is an echo of the input data to the handler component.
 19. The article of manufacture of claim 14, wherein the one or more computer files comprise Java class files. 